PAC Analyses of a 'Similarity Learning' IBL Algorithm
Abstract
VS-CBR [14] is a simple instance-based learning algorithm that adjusts a weighted similarity measure as well as collecting cases. This paper presents a 'PAC' analysis of VS-CBR, motivated by the PAC learning framework, which demonstrates two main ideas relevant to the study of instance-based learners. Firstly, the hypothesis spaces of a learner on different target concepts can be compared to predict the difficulty of the target concepts for the learner. Secondly, it is helpful to consider the 'constituent parts' of an instance-based learner: to explore separately how many examples are needed to infer a good similarity measure and how many examples are needed for the case base. Applying these approaches, we show that VS-CBR learns quickly if most of the variables in the representation are irrelevant to the target concept and more slowly if there are more relevant variables. The paper relates this overall behaviour to the behaviour of the constituent parts of VS-CBR.
Similar Resources
Instance-Based Prediction with Guaranteed Confidence
Instance-based learning (IBL) algorithms have proved to be successful in many applications. However, as opposed to standard statistical methods, a prediction in IBL is usually given without characterizing its confidence. In this paper, we propose an IBL method that allows for deriving set-valued predictions that cover the correct answer (label) with high probability. Our method makes use of a f...
Text Document Clustering based on Phrase
Affinity propagation (AP) was recently introduced as an unsupervised learning algorithm for exemplar-based clustering. In this paper, a novel text document clustering algorithm has been developed based on the vector space model, phrases, and the affinity propagation clustering algorithm. The proposed algorithm can be called Phrase affinity clustering (PAC). PAC first finds the phrase by Ukkonen suffix tree con...
Optimizing Nearest Neighbor Retrieval by Similarity Template and Retrieval Query Generation
The nearest neighbor algorithm is the most basic class of techniques in subfields of machine learning such as case-based reasoning (CBR), memory-based reasoning (MBR), and instance-based learning (IBL). In the nearest neighbor algorithm, the computational cost of example retrieval is one of the most important issues. This paper proposes a novel technique for optimizing the nearest neighbor al...
Noise-Tolerant Instance-Based Learning Algorithms
Several published reports show that instance-based learning algorithms yield high classification accuracies and have low storage requirements during supervised learning applications. However, these learning algorithms are highly sensitive to noisy training instances. This paper describes a simple extension of instance-based learning algorithms for detecting and removing noisy instances from conce...
Refuting data aggregation arguments and how the IBL model stands criticism: A reply to Hills and Hertwig (2012)
Hills and Hertwig (2012) challenge the proposed similarity of the exploration-exploitation transitions found in Gonzalez and Dutt (2011) between the two experimental paradigms of decisions from experience (sampling and repeated-choice), which was predicted by an Instance-Based Learning (IBL) model. The heart of their argument is that in the sampling paradigm, an impression of reduced exploratio...